Exploring Toxin Genes of Myanmar Russell’s Viper, Daboia siamensis, through De Novo Venom Gland Transcriptomics

The Russell’s viper (Daboia siamensis) is a medically important venomous snake in Myanmar. Next-generation sequencing (NGS) shows potential to investigate the venom complexity, giving deeper insights into snakebite pathogenesis and possible drug discoveries. mRNA from venom gland tissue was extracted and sequenced on the Illumina HiSeq platform and de novo assembled by Trinity. The candidate toxin genes were identified via the Venomix pipeline. Protein sequences of identified toxin candidates were compared with the previously described venom proteins using Clustal Omega to assess the positional homology among candidates. Candidate venom transcripts were classified into 23 toxin gene families including 53 unique full-length transcripts. C-type lectins (CTLs) were the most highly expressed, followed by Kunitz-type serine protease inhibitors, disintegrins and Bradykinin potentiating peptide/C-type natriuretic peptide (BPP-CNP) precursors. Phospholipase A2, snake venom serine proteases, metalloproteinases, vascular endothelial growth factors, L-amino acid oxidases and cysteine-rich secretory proteins were under-represented within the transcriptomes. Several isoforms of transcripts which had not been previously reported in this species were discovered and described. Myanmar Russell’s viper venom glands displayed unique sex-specific transcriptome profiles which were correlated with clinical manifestation of envenoming. Our results show that NGS is a useful tool to comprehensively examine understudied venomous snakes.


Introduction
Russell's viper is a widespread species found in South and Southeast Asia, as well as China. Based on previous molecular [1] and morphological [2] studies, the Daboia genus has been divided into two well-defined taxa: D. russelii (west of the Bengal Bay) and D. siamensis (east of the Bengal Bay). The species designation and regional distribution align with variations observed in the clinical features of Russell's viper envenomation [2,3]. The venom proteomes of Russell's viper vary across their geographical distribution and have been attributed to these differences [4][5][6][7][8][9].
Snake venoms are mixtures of proteins working synergistically to produce various toxic functions. These compounds evolved to immobilize, kill and initially digest their preys [10,11]. The myriad toxin compositions and functions are a byproduct of their interrelation between natural selection and biochemical processes shaping the protein-protein

Categorization of Transcripts and Gene Expression
The expression levels of toxin transcripts from both samples were compared with those of non-toxin transcripts from tissues of the same species (Bioproject PRJNA269317, Biosamples SAMN03081270) (Supplementary Tables S1 and S2).
The toxin transcripts were confirmed to be highly expressed in venom gland tissues. They were categorized according to TPM values. The individual toxin transcripts with a TPM value of more than 100 were categorized as highly expressed, a TPM value of 10 to 100 as moderately expressed and a TPM value of less than 10 as lowly expressed toxin groups. Only highly expressed groups were considered for further detailed evaluation and description in this study. Non-toxin full-length transcripts, such as glutaminyl-peptide cyclotransferase (highly expressed), tomoregulin-2 (moderately expressed) and endothelinconverting enzyme (lowly expressed), were also identified in the current transcriptome. Highly expressed transcripts were no less than 150-fold higher in the venom gland when compared to samples comprising other tissues (brain, kidney, heart, lung, spleen and liver).
The highly expressed venom transcripts were categorized into 23 toxin gene families, including 53 unique full-length sequences. The most prevalent transcripts were C-type lectins (CTLs) (26%), Kunitz-type protease inhibitors (KSPIs) (19%), disintegrins (18%) and bradykinin potentiating peptide/C-type natriuretic precursors (BPP-CNPs) (16%). The remaining 21% consisted of phospholipase A 2 (PLA 2 ), serine proteases, metalloproteinases, vascular endothelial growth factor (VEGF), L-amino acid oxidase (LAAO) and cysteine-rich secretory proteins (CRISPs) (Figure 1). venom gland when compared to samples comprising other tissues (brain, kidney, heart, lung, spleen and liver). The highly expressed venom transcripts were categorized into 23 toxin gene families, including 53 unique full-length sequences. The most prevalent transcripts were C-type lectins (CTLs) (26%), Kunitz-type protease inhibitors (KSPIs) (19%), disintegrins (18%) and bradykinin potentiating peptide/C-type natriuretic precursors (BPP-CNPs) (16%). The remaining 21% consisted of phospholipase A2 (PLA2), serine proteases, metalloproteinases, vascular endothelial growth factor (VEGF), L-amino acid oxidase (LAAO) and cysteine-rich secretory proteins (CRISPs) (Figure 1). When comparing males and females as shown in Figure 2, the male sample had CTLs, KSPIs, BPP-CNPs and disintegrins as the most highly expressed toxin groups, while metalloproteinases, CTLs, BPP-CNPs and KSPIs were the top highly expressed toxin groups in female snakes. The expression levels of transcripts for most male toxin groups were higher than those in the female-mapped reads, with 1.39-to 11.68-fold differences. However, the number of transcripts for serine proteases, glutaminyl-peptide cyclotransferase-1 and metalloproteinases, were 1. 16 When comparing males and females as shown in Figure 2, the male sample had CTLs, KSPIs, BPP-CNPs and disintegrins as the most highly expressed toxin groups, while metalloproteinases, CTLs, BPP-CNPs and KSPIs were the top highly expressed toxin groups in female snakes. The expression levels of transcripts for most male toxin groups were higher than those in the female-mapped reads, with 1.39-to 11.68-fold differences. However, the number of transcripts for serine proteases, glutaminyl-peptide cyclotransferase-1 and metalloproteinases, were 1.16, 1.22 and 2.33 times higher in female than male venom glands, respectively. Some toxin transcripts, such as cathelicidin-related protein (Vipericidin) and techylectin-like protein (Veficolin), were only identified in male samples.
Our sequence alignment (Supplementary Figure S1) identified that these sequences contained six conserved cysteine residues (typically eight in total for each sequence), along with a WIGL conserved motif for CTL and the WND Gla-domain binding site.
Our sequence alignment (Supplementary Figure S1) identified that these sequences contained six conserved cysteine residues (typically eight in total for each sequence), along with a WIGL conserved motif for CTL and the WND Gla-domain binding site.

Kunitz-Type Serine Protease Inhibitors
A total of 15 Kunitz-type serine protease inhibitor (KSPI) transcripts were identified, including seven full-length and eight partial-length transcripts. Thirteen unique isoforms of KSPIs were observed (Table 2).

RTS-Disintegrins
A total of three full-length disintegrin transcripts were identified, including two isoforms of RTS-containing disintegrins which were also described as Dis1a and Dis1b in our previous paper [25]. In the gene tree analysis, Dis1a shared a high sequence similarity with jerdostatin from Protobothrops jerdonii (Q7ZZM2), while Dis1b was in the same branch as acostatin-α from D. siamensis (AUF416558) (Supplementary Figure S2). In the sequence alignment, the N-terminal sequence of Dis1b was more similar to that of ascostatin-α than jerdostatin and it possessed an RTS motif instead of an RGD motif ( Figure 4).

RTS-Disintegrins
A total of three full-length disintegrin transcripts were identified, including two isoforms of RTS-containing disintegrins which were also described as Dis1a and Dis1b in our previous paper [25]. In the gene tree analysis, Dis1a shared a high sequence similarity with jerdostatin from Protobothrops jerdonii (Q7ZZM2), while Dis1b was in the same branch as acostatin-α from D. siamensis (AUF416558) (Supplementary Figure S2). In the sequence alignment, the N-terminal sequence of Dis1b was more similar to that of ascostatin-α than jerdostatin and it possessed an RTS motif instead of an RGD motif ( Figure 4).

Bradykinin-Potentiating Peptides and C-Type Natriuteric Peptide (BPP-CNP) Precursors
A total of four full-length BPP-CNP transcripts were identified in this study. In Figure 5, the repeated BPP units were present in a BPP-CNP precursor. These BPP units appeared in tripeptide metalloproteinase inhibitors. CNPs were found to be variable between these sequences. The sequence alignments of the natriuretic peptide domains of Myanmar Russell's viper BPP-CNP precursors with those of human and other snakes are shown in Figure 6.

Phospholipase A 2 (PLA 2 )
In the focal transcriptomes, there were five full-length and two partial-length PLA 2 candidates. Five full-length transcripts contain three group II PLA 2 and two group III PLA 2 transcripts. Three of the group II PLA 2 s (acidic and basic PLA 2 ) were highly expressed (TPM >10,000-14,000), followed by two group III PLA 2 transcripts (TPM 14-35), whereas two partial-length transcripts, group IIF PLA 2 homologs of Crotralus atrox and a crotoxin homolog from Protobothrops guttatus, were the lowest expressed transcripts (TPM 1-8). The highly expressed group II PLA 2 transcripts were aligned with PLA 2 from Eastern Russell's viper PLA 2 s, DsM b1and daboiatoxin A from Myanmar, PLA 2 -I from China and RV-7 from Taiwan ( Figure 7). A total of four full-length BPP-CNP transcripts were identified in this study. In Figure 5, the repeated BPP units were present in a BPP-CNP precursor. These BPP units appeared in tripeptide metalloproteinase inhibitors. CNPs were found to be variable between these sequences. The sequence alignments of the natriuretic peptide domains of Myanmar Russell's viper BPP-CNP precursors with those of human and other snakes are shown in Figure 6.  Conserved cysteine residues are underlined, while relatively conserved amino acid residues known for NPR binding are highlighted. (*) fully conserved residues; (:) strongly similar residues; (.) weakly similar residues. Important residues for NP receptor binding are highlighted in yellow color with variable residues are in green and pink color.

Phospholipase A2 (PLA2)
In the focal transcriptomes, there were five full-length and two partial-length PLA2 candidates. Five full-length transcripts contain three group II PLA2 and two group III

Phospholipase A2 (PLA2)
In the focal transcriptomes, there were five full-length and two partial-length PLA2 candidates. Five full-length transcripts contain three group II PLA2 and two group III PLA2 transcripts. Three of the group II PLA2s (acidic and basic PLA2) were highly whereas two partial-length transcripts, group IIF PLA2 homologs of Crotralus atrox and a crotoxin homolog from Protobothrops guttatus, were the lowest expressed transcripts (TPM 1-8). The highly expressed group II PLA2 transcripts were aligned with PLA2 from Eastern Russell's viper PLA2s, DsM b1and daboiatoxin A from Myanmar, PLA2-I from China and RV-7 from Taiwan ( Figure 7). (.) weakly similar residues.

Snake Venom Serine Proteases (SVSPs)
There were 14 transcripts (2 full-length and 12 partial-length) annotated as serine proteinases that were highly expressed (TPM 448.14-5495.65). Four transcripts gave an annotation as RVV-Vγ of D. siamensis (Supplementary Figure S3), three transcripts as αfibrinogenases and one transcript as β-fibrinogenase of D. siamensis. Four transcripts were homologous to VLSP-1 ( Figure 8) and one transcript to VLSP-3 from Macrovipera lebetina. The remaining one was partially similar to VSPBF-DSABSI from D. siamensis and partially

Snake Venom Serine Proteases (SVSPs)
There were 14 transcripts (2 full-length and 12 partial-length) annotated as serine proteinases that were highly expressed (TPM 448.14-5495.65). Four transcripts gave an annotation as RVV-V γ of D. siamensis (Supplementary Figure S3), three transcripts as αfibrinogenases and one transcript as β-fibrinogenase of D. siamensis. Four transcripts were homologous to VLSP-1 ( Figure 8) and one transcript to VLSP-3 from Macrovipera lebetina. The remaining one was partially similar to VSPBF-DSABSI from D. siamensis and partially to VLSP-3 from Macrovipera lebetina. We could not find out the exact similarity due to the partial-length nature of the remaining transcripts (Table 3). to VLSP-3 from Macrovipera lebetina. We could not find out the exact similarity due to the partial-length nature of the remaining transcripts (Table 3).

Snake Venom Metalloproteinases (SVMPs)
In the current transcriptomes, two full-length transcripts for RVV-X heavy chain (Russell's viper factor X activator, A0A2H4Z2W1) and two partial-length DSAIP (Daboia siamensis VLAIP-like SVMP, A0A2H4Z2Y9) are the most highly expressed SVMP transcripts (>1.0-1.8 × 10 4 TPM). Of the remaining three full-length transcripts with TPM values of 15-47, c15329_g2_i1_M_FL and c10072_g2_i1_F_FL were homologous to the A disintegrin and metalloproteinase domain (ADAM)-containing protein 10 from Vipera anatolica senliki (A0A6G5ZVR9) (Figure 9), and c19271_g2_i1_M_FL was homologous to SVMP from Agkistrodon contortrix contortrix (A0A1W7RJN7) (Figure 10). All conceptually translated proteins contained the signal peptide, a metalloproteinase domain containing a Zn + -binding catalytic site and a disintegrin domain. There was an additional ADAM17 MPD domain containing a CxxC motif in the c19271_g2_i1_M_FL transcript.

Snake Venom Metalloproteinases (SVMPs)
In the current transcriptomes, two full-length transcripts for RVV-X heavy chain (Russell's viper factor X activator, A0A2H4Z2W1) and two partial-length DSAIP (Daboia siamensis VLAIP-like SVMP, A0A2H4Z2Y9) are the most highly expressed SVMP transcripts (>1.0-1.8 × 10 4 TPM). Of the remaining three full-length transcripts with TPM values of 15-47, c15329_g2_i1_M_FL and c10072_g2_i1_F_FL were homologous to the A disintegrin and metalloproteinase domain (ADAM)-containing protein 10 from Vipera anatolica senliki (A0A6G5ZVR9) (Figure 9), and c19271_g2_i1_M_FL was homologous to SVMP from Agkistrodon contortrix contortrix (A0A1W7RJN7) (Figure 10). All conceptually translated proteins contained the signal peptide, a metalloproteinase domain containing a Zn + -binding catalytic site and a disintegrin domain. There was an additional ADAM17 MPD domain containing a CxxC motif in the c19271_g2_i1_M_FL transcript.

L-Amino Acid Oxidases (LAAOs)
In this study, two full-length and one partial-length LAAO transcripts with 99.93% identity (84% query coverage, E value 0.0) to DrLAO (G8XQX1) were identified. The sequence alignment is shown in Supplementary Figure S5.

Cysteine-Rich Venom Proteins (CRISPs)
There were two full-length highly expressed cysteine-rich secretory protein (CRISP) transcripts identified in the current transcriptomes which gave the annotation of Dr-CRPK from D. russelii (ACE73567), currently termed D. siamensis (Supplementary Figure S6). These deduced amino acid sequences were also identical to those of CRISPs (MRV CRISP) (JZ980958, JZ980965 and JZ980970), which were expressed in previous cDNA libraries [26].

Discussion
The de novo venom gland transcriptomic analysis provides a unique profile of the toxin composition of Myanmar D. siamensis venom and its suggestive toxicogenomic diversity. Our previous study on the venom gland cDNA library revealed unique gene expression patterns [26]; however, the use of NGS technology allowed us to provide a broader examination into the variation, distribution and diversity of toxin transcripts. Using Trinity as our de novo transcriptome assembly program, we were able to assemble a robust transcriptome using high-quality (>98% Q20 for both male and female) raw reads and subsequent mapping for gene expression analysis. Our initial exploration into these transcriptomes relied on RSEM and gene expression information to identify focal toxin candidate transcripts; however, future studies would benefit from exploring additional assembly programs providing a multi-faceted approach to analyzing these transcriptomes and account for chimeric transcripts [27]. The venom proteome of D. siamensis comprised six toxin families, including serine proteinases, metalloproteinases, PLA 2 , LAAO, VEGF and CTLs [4]. The BPP-CNP transcripts containing SVMPIs were identified only in the transcriptomes due to the co-expression of SVMPI tripeptides and atrial natriuretic peptides in the same transcripts [28]. Only the processed products could be detected in the venom. Low-molecular-weight disintegrins (<10 kDa) were detected only in a transcriptome because a proteomic study could identify proteins larger than 10 kDa. The reason that CRISPs (27 kDa) were not found in the proteome was unclear.
We found sex-related differences in venom transcript expression in Russell's viper, also detected in Bothrops jararaca [12], Thamnophis elegans [17] and Mesobuthus martensii [29]. These differences reflected the sex-based variations of the venom proteome and were closely associated with the biochemical and biological properties of different genders in the same species [14,[30][31][32]. In B. jararaca and B. moojeni, the variability in the venom composition was associated with animal sizes, as females are larger than males [32,33]. This coincided with the sexual dimorphism of the venom donors used in the study, as females (average weight in ounces × total length in inches x girth in inches: 49.9 × 47.8 × 6.5) were larger than males (29.8 × 47 × 5.3) and more venom was collected from females. There is also a correlation between venom yield and the length of Myanmar Russell's vipers, which is related to the severity of envenoming, time for coagulopathy development and the severity of local manifestations [34].
The transcripts corresponding to the toxin groups described below were found to be most abundant in both sexes and aligned with Myanmar Russell's viper envenomation symptoms. Snake venom C-type lectins (snaclecs) are heterodimers of two closely related α and β subunits [35]. They target factor IX, factor X, prothrombin or α-thrombin (as pro-or anticoagulant) and platelet receptors, such as glycoprotein (Gp) Ibα, Gp Ia/IIa, or Gp VI (as pro-or antithrombotic) [36,37]. Snaclecs activate platelets and thrombus formation causing thrombo-inflammation during viper envenoming [38]. Dabocetin α together with its β subunit interacts with platelet GpIb receptor, inhibiting platelet aggregation [39]. D. siamensis Snaclec-7 is a β subunit of a GpIb-binding protein [40], and together with dabocetin α is involved in platelet inhibition. The M. lebetina Snaclec A12 homolog, originally found in D. siamensis, has a stronger sequence similarity to α than β subunits compared to other heterodimeric snaclecs [41].
Snake venom protease inhibitors primarily disrupt blood coagulation resulting in blood loss and the death of preys [42]. The largest and most widely distributed family is Kunitz-type protein inhibitors, which strongly inhibit serine proteases. They contain 58-60 amino acid residues, including three disulfide bridges [43]. Several Kunitz-type protease inhibitors from Daboia species have previously been identified.
The snake venom disintegrins are a group of small cysteine-rich proteins found in a variety of snake families. They are divided into seven groups according to disulfide patterns and sequence signatures [44]. Jerdostatin homologs (Dis1a and Dis1b) containing eight cysteines belong to group 7A. Most of these small disintegrins are synthesized from short-coding genes, instead of a proteolytic process from PII SVMPs like the majority of disintegrins [45]. The RTS-disintegrins are expressed from the RPTLN gene and have a high conserved signal peptide sequence with SVMP genes, suggesting they may share tissue-specific regulatory elements [46,47]. The disintegrin motif is responsible for binding specificity to integrin receptors [48]. Modulating disintegrin motifs is interesting for the development of antiangiogenesis, antimetastasis, antiproliferative and proapoptotic agents for cancer treatments. Lowly expressed VGD-and MLD-containing dimeric disintegrins observed in our previous study [25] were not evaluated in the current study as the candidate transcripts had TPM values lower than 100 and were therefore not analyzed.
The BPP-CNP transcripts usually carry a tripeptide repeat together with bradykininpotentiating peptides (BPPs) and natriuretic peptide (NP) precursors [28,49]. The two tripeptides (pERW and pEKW) from Myanmar Russell's viper completely inhibited the gelatinolytic effects of RVV-X and the fibrinogenolytic activities of daborhagin [28]. These viper-restricted inhibitory peptides protected the venom gland from autodigestion and prevented hydrolysis of other proteins in venom glands [50]. Their inhibitory activity towards the SVMP Zn 2+ binding site [51] has made them popular targets for drug design in areas of neurodegenerative and cardiovascular diseases. Natriuretic peptides (NPs) in vertebrates promote salt secretion, regulating cardiovascular and renal systems. Three mammalian NPs are atrial natriuretic peptide (ANP), B-type natriuretic peptide (BNP) and C-type natriuretic peptide (CNP). ANP and BNP are released by cardiomyocytes in response to hypertension and hypervolemia, and CNP is produced by endothelial cells. The sequence alignment of various snake venom and human natriuretic peptides is shown in Figure 6, where five residues (F, D, R, I/V and L/F) are crucial for receptor binding [50]. Among MRV CNP transcripts, one had the same conserved amino acid residues, while remaining three possessed V instead of I and F instead of L residues. These variations should be explored in relation to their roles in terms of toxicity and their potential therapeutic properties. NPs have their specific niche in medical research in the treatment of heart failure and hypertension [51,52].
Phospholipases A 2 (PLA 2 ) catalyze a hydrolysis of glycerophospholipids, fatty acids and lysophospholipids and release arachidonic acids to trigger inflammation. Snake venom PLA 2 s (svPLA 2 ) are small, calcium-dependent, secreted enzymes ranging from 12 to 14 kDa. The class I PLA 2 s are found in Elapidae and Hydrophiidae, while class II are in Crotalidae and class III are in lizard and bee venoms [53]. There are two major types in Russell's vipers. PLA 2 s containing an N-terminal Asn (N) are found in species from China, Thailand, Myanmar and Pakistan, while the venoms of southern Indian and Sri Lankan species contain PLA 2 with an N-terminal serine (S). They confer hypotensive with neurotoxic and myonecrotic properties, respectively [26]. Furthermore, Russell's viper PLA 2 showed nephrotoxicity in mice [54]. All of the three PLA 2 transcripts from the current transcriptomes contained N-terminal Asparagine (N) and conserved Aspartic acid (D) at the 49 position. In all transcripts, 14 cysteine residues for seven conserved disulfide bonds indicated that they were GIIA PLA 2 s. They all possessed Thr41 which was found to be conserved in all the neurotoxic or myotoxic PLA 2 s. From Myanmar D. siamensis venom, DsM b1 had neurotoxic and lethal effects [55] and dabotoxin showed neurotoxic, myotoxic, cytotoxic, edema-inducing and indirect hemolytic activities [56].
Snake venom serine proteases (SVSPs) have two six-stranded β-barrels and a catalytic triad of His57, Asp102 and Ser159. Viper venoms contain a large variety of SVSPs, mostly as monomeric glycoproteins of 26 to 67 kDa. They act on almost all components of hemostatic and fibrinolytic systems and are responsible for the hemorrhage and shock after envenomation. Based on their biological roles, they are classified as activators of the fibrinolytic, procoagulant, anticoagulant and platelet-aggregating enzymes [57][58][59]. SVSPs from D. russelii are procoagulants that (i) activate factor V (RVV-V), (ii) have thrombin-like action causing defibrinogenation (Russelobin) and (iii) are αβ fibrinogenases with factor V-activation (RV-FVPα-δ). RV-FVPs showed in vivo anticoagulant effects in addition to in vitro procoagulant effects [60].
Snake venom metalloproteinases (SVMPs) are zinc endopeptidases varying from 20 to 100 kDa. They are the major components of viper venom, varying from 11% to 65% of the whole venom cocktail [61,62]. SVMPs have sequence and domain similarities to ADAM (A disintegrin and metalloproteinase) and can be grouped into three major classes (PI-PIII) according to their organization [53,63]. SVMPs are responsible for local and systemic hemorrhage, myonecrosis, the disruption of hemostasis mediated by procoagulant or anticoagulant effects or platelet aggregation, tissue necrosis and pro-inflammation [61]. These activities are mediated by metalloproteinase, disintegrin and/or disintegrin-like domains [62]. Snake venom metalloproteinases (SVMPs) isolated from Russell's viper were RVV-X (85 kDa), Daborhagin (65 kDa), VRR-73 (73 kDa), VRH-1 (23 kDa) and RVBCMP (25 kDa). RVV-X is a well-characterized class P-IIId SVMP. It activates factor X via a single cleavage at the Arg 194 -Ile 195 bond in factor X, which is responsible for disseminated intravascular coagulation (DIC) in snakebite patients [64]. RVV-X also binds to receptors on platelets and endothelial cells, resulting in clinical bleeding [60,65] and contributing to acute kidney injury in animal models [66]. The structure and biological activity of DSAIP still needs to be elucidated. RVV-X and daborhagin are found in both Myanmar and Indian RV species, while VRR-73, VRH-1 and RVBCMP are from only Indian RV [61]. Daborhagin-M is a P-IIIc hemorrhagic SVMP with high proteolytic activity toward fibrinogen, fibronectin and collagen [67]. These activities may be related to the severe bleeding in the patients.
A disintegrin and metalloproteinase (ADAM) family proteins in focal transcriptomes were expressed at moderate levels. Transcripts homologous to VIPAN (ADAM 10) from V. a. senliki and AGKCO (ADAM 17) from A. c. contortrix were found in Myanmar Russell's viper. ADAM 10 played a major role in the signaling pathway and ADAM 17 in the processing of tumor necrosis factor-α and other cell-surface molecules [68]. Although ADAM 17 is considered to be the primary sheddase for TNF-α, ADAM 10 also worked as a TNF-α sheddase in certain cell types [69]. The biological functions of ADAM 10 and 17 from snakes require further investigation.
Snake venom vascular endothelial growth factors (svVEGFs) play important roles in angiogenesis (endothelial growth and proliferation), hypotension and vascular permeability by binding to VEGF receptors. svVEGFs are 25-kDa, homodimeric heparin-binding proteins with 50% identity to VEGF-A165. They have a VEGF homology domain (VHD), with eight conserved cysteines forming a cysteine knot. The C-terminal variable region corresponds to heparin and NP-1 (coreceptor) binding sites [70,71]. VR-1, a svVEGF from D. russelii, increased endothelial cell proliferation, vascular permeability and hypotension through the specific activation of endothelial KDR (kinase insert domain-containing receptor) [72]. The VR-1 isolated from D. siamensis also increased endothelial cell proliferation [73]. The markedly enhanced vascular permeability caused by svVEGFs can lead to hypotension and shock.

Conclusions
The de novo venom gland transcriptomic analysis provided a unique profile of venom gene expression in D. siamensis from Myanmar. The expression patterns of toxin-encoding genes showed high structural redundancy in comparison to non-toxin genes, suggesting distinct functions of the variants in each group. The expression patterns of venom genes were also sex-specific. The distribution and expression levels of the principal toxin components yielded deeper insights into the toxic syndromes of Russell's viper envenoming, which included bleeding, hypotension and renal failure. Moreover, the full-length toxin sequences from this study would be the references for future proteomic analysis. Newly identified venom components will be a source for future structure-functional relationship studies and the development of new therapeutic modalities. Taking into account that the genomes of these species were not yet completed, this study provides rich data for a fast advancement in toxicogenomic research to study evolutionary origins and diversity in snakes.

Sample Preparation
Four (two male and two female) adult Russell's viper, D. siamensis, were captured from Kyun Gyan Gon Township, Yangon, Myanmar. Venom milking was carried out prior to tissue harvesting to stimulate the toxin transcription. The snakes were allowed to rest for four days to maximize RNA transcription. The venom glands were then removed, cut into pieces smaller than 5 × 5 mm and stored in an RNAlater solution (Ambion, Austin, TX, USA) at 4 • C overnight before transferring to −80 • C.

RNA Extraction, cDNA Library Construction and Sequencing
Total RNA from venom gland tissues was extracted using Trizol reagent (Life Technologies, Carlsbad, CA, USA) or a Total RNA Purification Kit (Jena Bioscience GmbH, Jena, Germany). The sample were stored at −80 • C. Subsequently, mRNA from samples was isolated using PolyAT Tract mRNA Isolation System (Promega, Madison, WI, USA) or a FastTrack MAG mRNA Isolation Kit (Invitrogen, Carlsbad, CA, USA). Both systems used MagneSphere ® technology. Subsequently, mRNA was precipitated, concentrated using 3M sodium acetate and isopropanol and stored at −80 • C. Isolated mRNA was suspended in RNase-free water and sent to Macrogen Inc. Geumchun-gu, Seoul, Republic of Korea. The mRNA sequencing libraries were prepared using the TruSeq RNA sample preparation kit (Illumina Inc., San Diego, CA, USA) with selected insert sizes of 200-400 bp. Libraries were then sequenced on the Illumina HiSeq2000 platform using (2 × 100 bp) paired-end reads.

De Novo Transcriptome Assembly and Expression Quantification
The quality of raw data or raw reads from the Illumina platform was assessed with the quality assessment software FastQC (http://www.bioinformatics.babraham.ac.uk/ projects/fastqc; accessed on 18 November 2014) and cleaned up using Trimmomatric (0.32) (http://www.usadellab.org/cms/?page=trimmomatic; accessed on 20 November 2014) to remove adaptor sequences, nucleotide errors and low-quality sequences. Two assembly tools, Trinity [24] and CLC Genomic Workbench 7.5 (CLC bio, a QIAGEN Company, Aarhus, Denmark), were used to assemble and compare alternative pathways for de novo transcriptome assemblies. The quality assessment tool QUAST was used to evaluate the quality of the assembled transcriptomes [74], indicating that the Trinity assembly had a higher quality score than the CLC genomic assembly (Supplementary Table S4). Thus, assembled data via Trinity were used for downstream analysis.
The transcriptomes were assembled in the Trinity software (r20140717) [24]. For each transcriptome, expression values were determined as transcripts per million (TPM) and fragments per kilobase million (FKPM) were calculated in the program RSEM [75,76] as part of the Trinity package. The raw data of the assembly were uploaded to NCBI database under BioProject PRJNA545823, Biosample SAMN11938797 for males and SAMN11939170 for females. The expression levels in terms of both TPM and FKPM of toxin transcripts were mapped to transcripts from non-venom gland tissues, such as brain, kidney, heart, lung, liver and spleen (Bioproject PRJNA269317, Biosamples SAMN03081270), of D. siamensis.

Venom Gland Transcript Classification
Toxin transcripts from the two transcriptomic datasets were identified and characterized using Venomix [77]. The Venomix pipeline identified candidate toxin genes based on BLAST hits (E-value 1 × 10 −20 ) against the ToxProt dataset [78]. A transcript was considered a "candidate" if the transcript had a lower E-value associated with a toxin than with a non-toxin protein in Uniprot. The candidate transcripts were translated into their protein sequences using Transdecoder [79] and further evaluated in ToxClassifier [80]. A predicted protein sequence with a score of >1 was considered a toxin candidate. The annotation of generated transcript sequences was aligned by standalone BLASTP (v. 2.10.0) to the protein database UniProtKB/Swiss-Prot [81] (Swissprot, Swiss Institute of Bioinformatics, Lausanne, Switzerland) of Serpentes (taxid: 8570) organism. Sequences with no matches were then searched against non-redundant protein sequences (nr) of the same organism. Proteins with the highest ranks in the BLASTP results were referenced to determine the functional annotations.
The transcripts from this study were named as "c" (component/contig), "g"' (subgraph/trinity gene) and "i" (path sequence/trinity isoform) followed by numbers. The letter "F" stood for female, "M" for male, "PL" for partial-length and "FL" for full-length transcripts.

Multiple Sequences Alignment and Gene Tree Reconstruction
The most highly toxin transcripts (TPM values ≥ 100) were conceptually translated into protein sequences and subjected to alignment using Clustal Omega [82]. Alignments of the first-described and previously reported sequences in Russell's viper are displayed in the manuscript and Supplementary Material, respectively. The conserved domains and functional motifs are identified and referenced in the figures. For some toxin gene families, the alignments were used to reconstruct in the program RAxML [83] using the PROTGAMMABLOSUM62 model with 1000 bootstrap replicates.
Supplementary Materials: The following supporting information can be downloaded at https: //www.mdpi.com/article/10.3390/toxins15050309/s1, Figure S1: Clustal Omega sequence alignment of the polypeptides encoded by 23 transcript candidate C-type lectins) from Myanmar D. siamensis; Figure S2: The gene tree analysis of the disintegrins from Myanmar D. siamensis, Dis1a (c13890_g2_i1_M) and Dis1b (c11797_g1_i1_F), with disintegrins from other species; Figure S3: Sequence alignment of RVV-V γ from Daboia siamensis and deduced amino acid sequences of transcripts from the transcriptomes; Figure S4: Sequence alignment of deduced amino acid sequences of VEGF transcripts with VR-1 from D. russelii (P67861); Figure S5: Sequence alignment of DrLAO (Daboia russelii, G8XQX1.1) with three deduced LAAO transcripts obtained from Myanmar Russell's viper transcriptome; Figure S6: Sequence alignment of deduced amino acid sequences of CRISP transcripts from the transcriptomes with cysteine-rich secretory protein Dr-CRPK from Taiwan Russell's viper (ACE73567); Table S1: Comparison of expression levels of toxin transcripts from male sample with non-toxin transcripts from tissues of D. siamensis; Table S2: Comparison of expression levels of toxin transcripts from female sample with non-toxin transcripts from tissues of D. siamensis; Table S3: Transcripts with snake venom C-type lectins (CTLs) annotation; Table S4; QUAST reports of assembled transcriptomes by two different assemblers; Table S5: List of full-length sequences with toxin gene families of Myanmar Russell's viper transcriptome. Reference [84] are cited in the supplementary materials.  Informed Consent Statement: Not applicable.

Data Availability Statement:
The raw data of the transcriptome assembly were uploaded to NCBI database under BioProject PRJNA545823, Biosample SAMN11938797 for males and SAMN11939170 for females.